Abstract: A model of an Ant System where ants are controlled by a
spiking neural circuit and a second order pheromone mechanism in a
foraging task is presented. A neural circuit is trained for individual ants
and subsequently the ants are exposed to a virtual environment where a
swarm of ants performs a resource foraging task. The model comprises
an associative and unsupervised learning strategy for the neural circuit 
of the ant. The neural circuit adapts to the environment by means 
of classical conditioning. The initially unknown environment includes 
different types of stimuli representing food (rewarding) and obstacles 
(harmful) which, when they come in direct contact with the ant, elicit a 
reflex response in the motor neural system of the ant: moving towards 
or away from the source of the stimulus. The spiking neural circuit
of the ant is trained to identify food and obstacles and move towards
the former and avoid the latter. The ants are released on a landscape
with multiple food sources where one ant alone would have difficulty
harvesting the landscape to maximum efficiency. In this case the
introduction of a double pheromone mechanism (positive and negative
reinforcement feedback) yields better results than traditional ant colony 
optimization strategies. Traditional ant systems include mainly a positive 
reinforcement pheromone. This approach uses a second pheromone 
that acts as a marker for forbidden paths (negative feedback). This 
blockade is not permanent and is controlled by the evaporation rate of 
the pheromones. The combined action of both pheromones acts as a 
collective stigmergic memory of the swarm, which reduces the search 
space of the problem. This paper explores how the adaptation and
learning abilities observed in biologically inspired cognitive architectures
are synergistically enhanced by swarm optimization strategies. The model
portrays two forms of artificial intelligent behaviour: at the individual level
the spiking neural network is the main controller and at the collective
level the pheromone distribution is a map towards the solution that
emerges from the colony. The presented model is an important pedagogical
tool as it is also an easy to use library that allows access to the spiking
neural network paradigm from inside Netlogo, a language used mostly
in agent based modelling and experimentation with complex systems.


1 Introduction 

The exploration of artificially constructed entities that
simulate biological counterparts has a long tradition in
the field of Artificial Intelligence, and in particular in
Artificial Life. Langton proposed the study of artificial life
with cellular automata in 1996 [2], where the author aimed
to "implement the 'molecular logic of the living state' in
an artificial biochemistry environment" via modeling the
artificial molecules as "virtual automata that were able
to roam in an abstract computer space and interact with
each other".

Ant colony based algorithms have been applied successfully
to several domains, namely for the clustering
of web usage mining 0, image retrieval [3], newspaper
and document organisation 151, l5l , and mainly to the
travelling salesman problem 0, 0, [8], [9], [10], [11],
among other combinatorial optimisation problems.

In nature, many ant species use trail laying
when foraging. They deposit pheromone, a volatile chemical
substance secreted by the ants
when returning to the nest carrying food, which acts as
a recruitment mechanism. This pheromone is detected
by other colony ants that can use it as an indicator 
for the food source. This process of ant recruitment is 
a positive reinforcement mechanism as more ants are 
driven to the foraging task and subsequently deposit 
more pheromone. This positive amplification allows the
colony to exploit the food source in optimal time. In the
model presented here the only process of recruitment
is based on chemical trails and is therefore referred to as
mass recruitment [6].

Social insects are very well adapted to solve the foraging
problem. The resilience of the species relies on
the flexibility presented by social insects to changing 
problem landscapes. As sources of food are depleted 
their strategy needs to allow them to find newer sources 
and they need to adapt quickly to new situations. The 
use of pheromones allows them to collectively adapt to 
changing environments, and the colonies become robust 
entities even if some individuals fail to perform their 
tasks. This "swarm intelligence" reflects the fact that
these social insects are capable of self-organisation.
Self-organisation is a process whereby a higher-order structure
emerges from the dynamics of the parts of the system.
Self-organising systems like ant colonies present defining
features: they exhibit some kind of positive feedback
mechanism, they also exhibit negative feedback mechanisms,
they amplify fluctuations observed in nature, and
their self-organisation relies on the existence of multiple
interactions. Although one single ant can deposit a trail
of pheromone, the benefit of these trails is only observed 
when many ants interact together via the recruitment 
process. 

While ants can interact with other ants in a direct
way, via antennation or trophallaxis for example, in
this model we only consider indirect interaction via the
environment. This process is known as stigmergy. One
ant deposits pheromones at a certain time and in a
certain location and other ants will interact with the
deposited pheromone at a later time.

Following this introduction, section 2 presents the
model in its constituent parts. Subsections 2.1 and 2.2
discuss the use of spiking neural networks, compared
to traditional artificial neural networks and in the context
of controllers for autonomous systems, respectively.
Subsection 2.3 presents the spiking neural network submodel
while subsection 2.4 shows the brain of the virtual
ant. It is followed in subsection 2.5 by the description
of the double pheromone mechanism, and section 2
concludes with details of the implementation in
subsection 2.6. Illustrative results are presented in section
3, followed by the conclusion in section 4. A
simplified version of the spiking neural network is
available for download at
http://modelingcommons.org/browse/one_model/4455.


2 A Model for Internal+External Artificial Intelligence

2.1 Spiking versus traditional Neural Networks

In traditional Artificial Neural Network (ANN) models
(e.g. McCulloch-Pitts and sigmoidal neurons) the neural
activity is represented by the neurons' firing rates, thus
the timing between single pulses (pulse code) is not
taken into account da. On the other hand, Spiking
Neural Networks (SNNs) are able to represent neural
activity in terms of both rate and pulse codes. The pulse
code, which makes use of the information contained in
the interspike interval, has been associated with fast
information processing in the brain in cases where the
integration and computation of average
rates (rate code) would take too long (e.g., a housefly
can change the direction of its flight in reaction to
visual stimuli in about 30 milliseconds) [12]. In addition
to the enhanced computational capabilities added by
the time dimension, the resemblance of SNN dynamics
to their biological counterparts is significantly closer
than in first and second generation ANNs.
SNNs are capable of simulating a broad range of learning
and spiking dynamics observed in biological neurons,
including: spike timing dependent plasticity (STDP),
long term potentiation and depression (LTP/LTD), tonic
and phasic spiking, inter-spike delay (latency), frequency
adaptation, resonance and input accommodation [13].
The STDP rule mentioned above is implemented in
the proposed neural circuit as the underlying mechanism
for associative and classical conditioning learning.
Experimental results have demonstrated that different
types of classical conditioning (e.g., Pavlovian, extinction,
partial conditioning, inhibitory conditioning) can be
implemented successfully in SNNs [14], [15].


2.2 Suitability of Spiking neural networks for controlling
autonomous systems (robots and agents)

The capabilities shown by insects (given their lower
neural complexity compared to vertebrates) to interact
and cope with the environment, including exploration,
reliable navigation, pattern recognition and interactions
with each other, are being considered as key features for
implementation in the design of autonomous robots [16].
Because SNNs reproduce to some extent
the computational characteristics of biological neural
systems, they are a potential computational instrument
to achieve the above mentioned features in artificial
systems. There is increasing research (e.g., [14], [16],
[HZL EO ) on the use of SNNs to control autonomous
systems which exhibit intelligent behaviour in terms of
learning and adaptation to the environment. Wang et
al. m compare the implementation of SNNs with traditional
ANN autonomous controllers, highlighting the
following advantages of the SNN approach: (1)
spatio-temporal information is used more efficiently in SNNs
than in ANNs; (2) the topology of the SNN controller is
much simpler than in ANNs; (3) the (Hebbian) training
method in the SNNs was easier to implement than in
ANNs. SNNs are demonstrating that their application
as artificial neural controllers in autonomous systems is
not only advantageous in computational terms (when
compared with previous connectionist models) but
also allows the implementation of biologically inspired
neural systems (e.g., [16], [15], [19]) to be used in
machines.
2.3 The Spiking Neural Network (SNN) model


The SNN model implemented in Netlogo [20] is a simplified
but functional version of integrate and fire neuron
models nia eh aimed at pedagogical purposes and
experimentation with small spiking neural circuits. The
artificial neuron is implemented as a finite-state machine
where the state transitions depend on a variable which
represents the membrane potential of the cell. All the
characteristics of the artificial neuron, including (1) membrane
potential, (2) resting potential, (3) spike threshold,
(4) excitatory and inhibitory postsynaptic response, (5)
exponential decay rate and (6) absolute and refractory
periods, are enclosed in two possible states: open and
absolute-refractory.


Fig. 1. Modeling of the membrane potential in the implemented
SNN model (diagram labels: input spikes, open state)


In the open state the artificial neuron is receptive to
input pulses coming from presynaptic neurons. The amplitude
of postsynaptic potentials elicited by presynaptic
pulses is given by the function psp() (see Fig. 1). The
membrane perturbations reported by psp() are added to
(excitatory postsynaptic potential, EPSP) or subtracted from
(inhibitory postsynaptic potential, IPSP) the current
value of the membrane potential u. If the neuron firing
threshold d is not reached by u, then u begins to decay
(decay() function in Fig. 1) towards a fixed resting
potential. On the other hand, if the membrane potential
reaches the threshold, an action potential or spiking
process is initiated. In the model used here, when u reaches
the firing threshold d, this triggers a state transition
from the open to the absolute-refractory state. During the
latter, u is set to a fixed refractory potential value a v and
all incoming presynaptic pulses are ignored. Fig.
1 illustrates the behaviour of the membrane potential in
response to incoming presynaptic spikes.
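
The two-state behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Netlogo implementation: the class name, the numeric values, the length of the refractory state and the choice to return to the resting potential when the neuron reopens are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Neuron:
    """Two-state (open / absolute-refractory) neuron; all values are illustrative."""
    resting: float = -65.0        # resting potential (assumed value)
    threshold: float = -55.0      # firing threshold d (assumed value)
    refractory_u: float = -75.0   # potential held during the refractory state (assumed)
    decay_rate: float = 0.1       # exponential decay constant per tick (assumed)
    refractory_ticks: int = 2     # ticks spent in the absolute-refractory state (assumed)
    u: float = -65.0              # membrane potential
    refractory_left: int = 0      # 0 means the neuron is in the open state

    def step(self, psp_sum: float = 0.0) -> bool:
        """Advance one tick with net input psp_sum (EPSP > 0, IPSP < 0); True on a spike."""
        if self.refractory_left > 0:
            # Absolute-refractory state: incoming pulses are ignored.
            self.refractory_left -= 1
            if self.refractory_left == 0:
                self.u = self.resting  # assumption: reopen at the resting potential
            return False
        # Open state: add the postsynaptic potentials to the membrane potential.
        self.u += psp_sum
        if self.u >= self.threshold:
            # Threshold reached: spike and switch to the absolute-refractory state.
            self.u = self.refractory_u
            self.refractory_left = self.refractory_ticks
            return True
        # Below threshold: decay exponentially towards the resting potential.
        self.u = self.resting + (self.u - self.resting) * math.exp(-self.decay_rate)
        return False
```

Calling `step(12.0)` on a fresh neuron crosses the assumed threshold and produces a spike; the next two calls are ignored regardless of input, after which the neuron is receptive again.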



Fig. 2. Neural circuit controller of the virtual ant 


2.4 The brain for the virtual ant 

2.4.1 Learning with the Spike Timing Dependent Plasticity
(STDP) rule

In this paper, the STDP model proposed by Gerstner
et al. 1996 [22] has been implemented as the underlying
learning mechanism for the ant's neural circuit. In
STDP the synaptic efficacy is potentiated or depressed
according to the difference between the arrival time of
incoming pre-synaptic spikes and the time of the action
potential triggered at the post-synaptic neuron.

The following formula [22] describes the weight
change of the synapse according to the STDP model for
pre-synaptic and post-synaptic neurons j and i respectively.
Here, the arrival time of the pre-synaptic spikes
is indicated by $t_j^f$ while $t_i^n$ represents the firing time at
the post-synaptic neuron:

\[ \Delta w_j = \sum_{f=1}^{N} \sum_{n=1}^{N} W\left(t_j^{f} - t_i^{n}\right) \tag{1} \]

The weight change resulting from the combination of
a pre-synaptic spike with a post-synaptic action potential
is given by the function $W(\Delta t)$ [22]:

\[ W(\Delta t) = \begin{cases} A_{+} \exp(\Delta t / \tau_{+}), & \text{if } \Delta t < 0 \\ -A_{-} \exp(-\Delta t / \tau_{-}), & \text{if } \Delta t > 0 \end{cases} \tag{2} \]

where $\Delta t = t_j^f - t_i^n$ is the time difference between the arriving
pre-synaptic spike and the post-synaptic action potential.
$A_{+}$ and $A_{-}$ determine the amplitude of the weight
change when increasing or decreasing respectively. $\tau_{+}$
and $\tau_{-}$ determine the reinforcement and inhibitory intervals,
or size of the learning window.
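
As a worked illustration of the learning window in Eq. (2), the following Python sketch evaluates the weight change for a set of spike times. It assumes that the time difference is taken as pre-synaptic arrival time minus post-synaptic firing time, so that a negative difference (pre before post) potentiates the synapse. The function names and the parameter values are hypothetical and are not taken from the Netlogo library.

```python
import math


def stdp_window(dt, a_plus=0.1, a_minus=0.12, tau_plus=20.0, tau_minus=20.0):
    """Weight change W(dt) for dt = t_pre - t_post, using the sign convention of Eq. (2)."""
    if dt < 0:
        # Pre-synaptic spike arrives before the post-synaptic spike: potentiation.
        return a_plus * math.exp(dt / tau_plus)
    if dt > 0:
        # Pre-synaptic spike arrives after the post-synaptic spike: depression.
        return -a_minus * math.exp(-dt / tau_minus)
    return 0.0  # assumption: Eq. (2) does not define the case dt == 0


def weight_change(pre_spikes, post_spikes, **params):
    """Sum the window over all pairs of pre- and post-synaptic spike times."""
    return sum(
        stdp_window(t_pre - t_post, **params)
        for t_pre in pre_spikes
        for t_post in post_spikes
    )
```

With these assumed parameters, a pre-synaptic spike 5 ms before the post-synaptic spike strengthens the synapse more than one arriving 50 ms before, which is the narrowing effect of the learning window.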


2.4.2 Topology of the neural circuit 

The neural circuit presented in this work (Fig. 2) enables
a simulated ant to move in a two dimensional world,
learning to identify and avoid noxious stimuli while
moving towards perceived rewarding stimuli. At the
beginning of the training phase, the ant is not aware 
of which stimuli are to be avoided or pursued. Learning
occurs through reward-and-punishment classical conditioning
[23]. Here the ant learns to associate the information
represented by different colours with unconditioned
reflex responses. In terms of classical conditioning, learning
can be described as the association or pairing of a
conditioned or neutral stimulus with an unconditioned
stimulus (one that elicits an innate or reflex response).
Thus, the neutral or conditioned stimulus acquires the
ability to elicit the same response or behaviour produced
by the unconditioned stimulus. The pairing of two unrelated
stimuli usually occurs by repeatedly presenting
the neutral stimulus shortly before the unconditioned
stimulus that elicits the innate response. In the context
of classical conditioning in animals, the word "shortly"
refers to a time interval of a few seconds (or in some
cases a couple of minutes). On the other hand, at the
cellular level and in the context of STDP, the association
of stimuli encoded as synaptic spikes occurs in short
millisecond intervals [22].

2.4.3 Sensory system 

The neural circuit of the ant is able to process three
types of sensory information: (1) olfactory, (2) pain
and (3) pleasant or rewarding sensation. The olfactory
information is acquired through three olfactory receptors
(see Fig. 2) where each one of them is sensitive to one
specific smell represented with a different colour (white,
red or green). Each olfactory receptor is connected with
one afferent neuron which propagates the input pulses
towards the motoneurons. Pain is elicited by a nociceptor
whenever the insect collides with a wall or another
obstacle. Finally, a rewarding or pleasant sensation is
elicited when the insect gets in direct contact with a
positive stimulus (e.g. food).

2.4.4 Motor system 

The motor system allows the virtual ant to move forward
or rotate in one direction according to the reflexive
behaviour illustrated in Fig. 2. In order to keep
the ant moving even in the absence of external stimuli,
the motoneuron M is connected to a neural oscillator
sub-circuit composed of two neurons H1 and H2 (Fig.
2), performing the function of a pacemaker which sends
a periodic pulse to M. The pacemaker is initiated by
a pulse from an input neuron which represents an
external input current (e.g., an intracellular electrode). Fig.
2 illustrates the complete neural anatomy of the ant.

2.4.5 Pheromone system 

The positive and negative pheromone systems are controlled
by the two neurons Pp and Np respectively. The
neuron Pp is activated by the reward sensor F, resulting
in the release of positive pheromone whenever the ant
gets in contact with food (or any other positive stimulus
associated with the activation of F). On the other hand,
the neuron Np is activated by the summation of pulses
coming periodically from the oscillator sub-circuit (neurons
H1 and H2). Np works as an energy-consumption
counter which fires unless it is inhibited by the reward
sensor F. Thus, whenever the ant gets in contact with
food the energy counter is reinitialized.

2.5 The Double Pheromone as the basis for collective
intelligence

Traditional models of ant colony systems use mainly a 
positive pheromone feedback mechanism, meaning that 
they simulate a single chemical being secreted by the 
ants and that this chemical acts as an attractor for other 
ants to follow the pheromone trail. Recent findings on
colonies of Monomorium pharaonis ants show that this
species uses a negative pheromone to help repel foragers
from unrewarding areas of the landscape [24], [25]. This
empirical evidence shows that this second chemical acts
as a 'no entry' signal that ants deposit when they find
unrewarding paths.

The application of a double pheromone mechanism
in artificial ant systems has been shown to improve
performance on ant colony optimisation problems,
as the use of a negative 'no entry' pheromone reduces
the exploration space in symmetric travelling salesman
problems [5], [11], [26], [27].

Following these ideas, in the proposed model ants
explore the landscape according to rules established
by the internal intelligence (or brain), dictated by the
responses of the spiking neural network. The deposition
of pheromones occurs in two
situations:

Negative Pheromone deposition:

Occurs after a threshold time since food was
last found by the ant.

Positive Pheromone deposition:

Occurs immediately after an ant finds food and
the deposition of positive pheromone persists
for a parameterized amount of time after that.
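
The two deposition rules above, together with the evaporation that makes the markers temporary, can be sketched as follows. This is a minimal Python illustration under stated assumptions rather than the Netlogo code: the names `AntState`, `deposit` and `evaporate`, the default thresholds, and the choice to keep depositing negative pheromone on every tick once the threshold has passed are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AntState:
    ticks_without_food: int = 0    # ticks since food was last found (or since release)
    positive_ticks_left: int = 0   # remaining ticks of positive deposition


def deposit(state, found_food, negative_threshold=50, positive_duration=10):
    """Return 'positive', 'negative' or None for the current tick."""
    if found_food:
        # Finding food restarts the counter and starts positive deposition.
        state.ticks_without_food = 0
        state.positive_ticks_left = positive_duration
    else:
        state.ticks_without_food += 1
    if state.positive_ticks_left > 0:
        state.positive_ticks_left -= 1
        return "positive"
    if state.ticks_without_food >= negative_threshold:
        # Assumption: negative pheromone is laid on every tick past the threshold.
        return "negative"
    return None


def evaporate(field, rate=0.05):
    """Return a new pheromone field with every level reduced by the evaporation rate."""
    return {patch: level * (1.0 - rate) for patch, level in field.items()}
```

Here `field` is assumed to be a dictionary from patch coordinates to pheromone level; a higher `rate` shortens the time for which a negative marker blocks a path.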

2.6 Implementation in Netlogo 

In Netlogo there are four types of agents: turtles,
patches, links and the observer [28]. The virtual ants,
as well as each neuron in the implemented SNN model,
are represented by turtle agents. Synapses, on the other
hand, are represented by links. The produced pheromone
is represented by patches. All simulated entities, including
the insect, neurons and synapses, have their own
variables and functions that can be manipulated using
standard Netlogo commands. The Netlogo virtual world
consists of a two dimensional grid of patches where
each patch corresponds to a point (x,y) in the plane.
In a similar way to the turtles, the patches own a set of
primitives which allow the manipulation of their characteristics
and also the programming of new functionalities
and their interaction with other agents. The visualization
of the ants and their environment is done through
Netlogo's world-view interface.

The virtual world of the ant is an ensemble of patches 
of four different colours, where each one of them is 
associated with a different type of stimulus. White and
red patches are both used to represent harmful stimuli.
Thus, if the ant is positioned on a white or red patch,
this will trigger a reaction in the ant's nociceptor (pain
sensor) and its corresponding neural pathway (Fig. 2).
On the other hand, green patches trigger a reaction in
the reward sensor of the ant whenever it is positioned
on one of them. Black patches represent empty spaces
and do not trigger any sensory information in the ant at
all.
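
The mapping from patch colour to sensory input can be summarised in a short Python sketch. It is an illustration only: the table and function names are hypothetical, and it assumes that the olfactory receptor for a colour and the contact sensor are both activated when the ant is on a patch of that colour.

```python
from typing import Optional, Tuple

# Contact sensor triggered by each patch colour, as described in the text.
CONTACT_SENSOR = {
    "white": "nociceptor",  # harmful stimulus
    "red": "nociceptor",    # harmful stimulus
    "green": "reward",      # rewarding stimulus (food)
}

# Colours that have a dedicated olfactory receptor.
SMELLS = ("white", "red", "green")


def stimuli_for_patch(colour: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (olfactory receptor, contact sensor) activated by a patch colour."""
    olfactory = colour if colour in SMELLS else None
    contact = CONTACT_SENSOR.get(colour)  # black (empty) patches trigger nothing
    return olfactory, contact
```

In conditioning terms, the olfactory entry plays the role of the neutral stimulus and the contact entry that of the unconditioned stimulus it becomes paired with.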



Fig. 3. Short trajectories at the beginning of the training 
phase. The ant collides and escapes the world repeatedly. 



Fig. 4. Long trajectory shows ant avoiding red and white 
patches. 

At the beginning of the training phase (Fig. 3) the ant
moves along the virtual world colliding indiscriminately
with all types of patches, and it is repositioned at its
initial coordinates every time it reaches the virtual-world
boundaries. As the training phase progresses it can be
seen that the trajectories lengthen as the ant learns to
associate the red and white patches with harmful stimuli
and consequently to avoid them (Fig. 4).


Once the training phase has been completed, the conditioned
aversion to white and red patches is exploited
by using the white patches to delimit the boundaries of
the virtual world of the ants (using white as walls)
while the red patches are used to represent the negative
pheromone released by the ant.

3 Results 

Figures 5-8 illustrate a sequence of the ants' movements 
using the double pheromone. 



Fig. 5. No Pheromone. 

Fig. 5 shows the ant swarm moving through the
virtual world. Since the virtual ants can only react to
stimuli located directly in front of them, they manage to
avoid the obstacles delimiting the virtual world (brown
or white patches). However, they are not able to detect
the food sources located inside the virtual world.



Fig. 6. After 200 iterations with activated Pheromone. 

Fig. 6 shows that when the ants start releasing the
pheromone their otherwise monotonous trajectories are
affected, as the pheromone constitutes a new obstacle
which is avoided, thus creating several random trajectories
for the ants.

Fig. 7. After 1000 iterations with activated Pheromone.

Fig. 8. After 8000 iterations with activated Pheromone.

Figs. 7 and 8 show that the ants follow new trajectories
which allow them to find the food and eat it. At the same
time, the negative pheromone released, by occupying
previously empty spaces in the virtual world, reduces
the areas where the ants move to find food (the search
space). Fig. 8 shows that after several iterations the food
sources have been exhausted. The foraging activity is
shown in Fig. 9, which illustrates the available food over
time in the landscape.



Fig. 9. Amount of available food during simulation with 
activated pheromone. 


4 Conclusion 

Combining internal and external forms of intelligence,
or at least forms of individual and societal decision
processes, is a challenging problem. It is one task that
benefits from building on top of biological findings. In
this work an agent based model of an Ant System was
presented where both the individual ants and the colonies
are organised based on biological principles.

The model encompasses a combined synergy between 
internal individual intelligence and external collective 
intelligence. The former is presented in the form of a 
Spiking Neural Network while the latter is comprised of 
a double pheromone based space exploration and mass 
recruitment. 

It was shown how a Spiking Neural Network can be 
used to endow each ant with the ability to recognise 
rewarding and harmful stimuli. In this paper we demonstrated
how associative learning in SNNs can be used
to allow accurate navigation of virtual ants in a two
dimensional environment. Although the demonstrated
association tasks are based on simplified action-reaction
(sensor-actuator) relationships, it is possible to extend
the neural circuit to associate more complex input patterns
(e.g. bitmap images and other sensor arrays) with
different types of actuators in order to produce more
complex and intelligent behaviour.

It was also shown how the results obtained by the
individual ants are enhanced at the collective level
by using a double pheromone mechanism. This
self-organisation principle allows the collective, even in a
small scale model such as the one presented, to exhibit
emergent features and to solve the problem of foraging by
communicating through pheromone deposited in the
landscape. This deposition acts as a memory (even if
temporary, because of the action of evaporation) that
allows the colony to improve its foraging.

The model depicts two forms of intelligence in an easy 
to use and easy to understand software package that 
introduces two important paradigms of artificial life to 
a vast community of scientists. 